USING REGRESSION METHODS

R.  R.  Read 

Naval  Postgraduate  School 
Monterey,  California 

ABSTRACT 

It is shown that the use of regression methods in the
forecasting of Separations (EAOS), Eligibles (to reenlist),
and Non-reenlistments jointly by length of service and pay
grade is competitive with the currently used "alpha"
method. The question of whether one of the two methods of
forecasting  is  clearly  superior  could  not  be  addressed  with 
the  currently  available  data.  The  report  describes  the  data 
base,  presents  various  general  characteristics  of  the  data, 
summarizes  the  computational  results  that  lead  to  the 
recommended  choices  of  input,  and  recommends  follow-on  work 
to  clarify  the  issues. 


I. Introduction


Previous  work  explored  the  use  of  regression  techniques 
(especially  ridge  regression)  in  the  forecasting  of  contract 
losses,  gains,  and  attritions  by  a set  of  LOS  (length  of  service) 
categories  and  (separately)  by  PG  (pay  grade)  categories.  For 
sake  of  immediate  reference,  the  results  are  included  as 
Appendix  A of  the  present  report.  The  reader  is  referred  to 
its  first  four  pages  for  the  definition  of  terms  whose  use 
continues  in  the  current  work. 

The  current  follow-on  efforts  ignore  the  gain  and 
attrition  quantities  in  order  to  focus  on  three  important 
components  of  contract  losses: 

S = Separations (EAOS)

Y = Eligibles

V = Non-Reenlistments

At  the  same  time  the  forecasts  are  more  refined  in  that  they 
must  be  made  jointly  by  31  LOS  cells  and  7 PG  cells.  Pay 
grades E1 to E3 are lumped together in the first cell and
pay grades E4 thru E9 form the remaining six cells.

In  addition  the  current  study  utilizes  an  increased 
and  modified  data  base.  Now  there  are  eleven  years  of  data 
(1966  to  1976  inclusive)  and  the  definition  of  LOS  has 
changed,  the  new  data  reflecting  this  change  and  some  other 
changes  whose  nature  is  not  explicitly  known  to  the  author. 

These changes make obsolete the specific results of the previous
work.

The updating of the data base has also had an effect
on the currently used method (which we call the alpha method)
for forecasting the three object variables S, Y, and V.

The  results  of  this  method  are  available  only  for  1976,  making 
difficult  the  comparison  of  the  regression  forecasts  with 
current  method  forecasts  since  no  actuals  (1977)  are  yet 
available.  Hence  a fair  comparison  of  the  two  methods  cannot 
be  made  at  this  time. 

A biased comparison is made instead. All eleven years'
data are used to develop the (1976) forecast coefficients for
both methods, but these are applied to the previous year's data (i.e.
1975 serving as input data) to forecast 1976. Thus the 1976
data can serve as the actuals. The comparison is compromised
because the 1976 data were also used to produce the forecast
coefficients. Some reasons why this deficiency may favor the
alpha method will be suggested further on in the report.

The regression methods involve the use of p input
variables, with p ranging over 2, 3, and 4, and the most favorable
set of p variables is always selected. Two measures of
comparison are computed:


MAE = mean absolute error
RMSE = root mean square error

where  the  errors  are  the  differences  between  actuals  and 
forecasts,  and  the  means  are  computed  over  the  217  (31  x 7) 
cells. The results are summarized in Table 1.1. It is concluded
that the regression method is competitive with the
currently used alpha method.
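The two summary measures can be sketched in a few lines. This is a minimal illustration, assuming the actuals and forecasts for the cells are supplied as flat lists of equal length; the function names and the sample values are hypothetical and are not taken from the report's data.

```python
import math

def mae(actuals, forecasts):
    """Mean absolute error: average of |actual - forecast| over all cells."""
    errors = [a - f for a, f in zip(actuals, forecasts)]
    return sum(abs(e) for e in errors) / len(errors)

def rmse(actuals, forecasts):
    """Root mean square error: square root of the average squared error."""
    errors = [a - f for a, f in zip(actuals, forecasts)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# Hypothetical values for four cells (the report uses 217 cells).
actuals = [100.0, 50.0, 10.0, 0.0]
forecasts = [90.0, 55.0, 10.0, 4.0]

print("MAE  =", mae(actuals, forecasts))
print("RMSE =", rmse(actuals, forecasts))
```

Because the errors are squared before averaging, RMSE weights a few large cell errors more heavily than MAE does, which is worth keeping in mind when the two measures rank the methods differently.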
TABLE 1.1

COMPARISON OF SUMMARY MEASURES OF FORECAST ERROR
FOR THE 'ALPHA METHOD' AND THE
BEST MULTIPLE REGRESSION USING p VARIABLES

                            alpha    p = 2    p = 3    p = 4
Separations         MAE     123.5    161.5    146.4    118.4
                    RMSE    496.0    651.7    576.4    436.7
Eligibles           MAE     104.6    120.2     88.3     75.9
                    RMSE    380.6    469.4    292.9    232.8
Non-Reenlistments   MAE      50.6     77.2     54.9     40.3
                    RMSE    291.8    338.0    238.7    199.7

MAE = mean absolute error
RMSE = root mean square error.
The report is organized as follows. A description of
the new data base and the anomalies remaining in it is contained
in Section II, along with some general comments on what the
author knows about the alpha method. Section III contains some
comparisons over time of the macro behavior of the five
important variables contained in the new data. Some interpretations
and speculations are made. Section IV contains the
refined details of the comparison of regression methods and the
alpha method. Further interpretations are made. Conclusions
and recommendations follow.

As  mentioned  earlier,  the  report  of  the  previous  work 
appears  in  Appendix  A.  Eleven  year  means  of  the  three  object 
variables  appear  in  Appendix  B.  Appendix  C contains  APL  programs 
pertinent  to  this  report. 
II. Description of the Data and the 'Alpha' Method


The new data contain eleven years' (1966 to 1976
inclusive) data for the five variables

V = non-reenlistments 

S = separations  (EAOS) 

Y = eligibles 

X = retentions 

T = inventory  (total) 

for each of the original 279 (= 31 x 9) LOS/PG cells. (The
telescoping of pay grades E1-E3 is done later when forecasting.)

All of the eleven years' data have been reworked to
accommodate a new definition of LOS. The old definition was
based upon pay entry base date, which included pay credits for
other federal service. The new definition refers to TAFMS
(total active federal military service). Although the exact
meaning is unknown to the author, it is known that reserve duty
does not count. (It appears that the date of actual entry is
used by no one.) The new definition affects our previous LOS
entries quite noticeably.

Previously, the variable V (non-reenlistments) was
a derived quantity, being the difference of eligibles and
retentions. Now it is obtained independently in some way and
there is noise in the relationship V = Y-X. This and other
data anomalies are described next.
Each of the five variables V, S, Y, X, T is recorded
in 3069 (= 11 x 31 x 9) cases. Separations equal or exceed eligibles
(i.e. S ≥ Y) in all of them, as is proper. However, eligibles
fail to equal or exceed retentions (Y ≥ X) in 97 (of 3069) cases. Of
these, 44 failures are in the most recent two years (1975-76)
and in PG E8 and E9. PG E7 contains some concentration of these
also, but the remainder appear to be scattered. The non-reenlistment
variable fails to be non-negative (V ≥ 0) in 107 cases.
Again 44 of these failures are in the last two years and for
E8 and E9. In fact these four cells (summed over LOS) have the
exact same distribution as the previous anomaly (Y ≥ X).
Again E7 has a number of negative entries. The differences
between separations and retentions (contract losses) should be
non-negative. This fails (S ≥ X) in 76 cases. The noise is
concentrated in PG E7, E8, E9 and years 1975-76, which accounts
for 52 of the 76 cases. The relationship V = Y-X (i.e.
non-reenlistments form the difference between eligibles and
retentions) holds up in 2256 of the 3069 cases (74%). The
discrepancies cluster mostly in E4 to E7 and the first 20 LOS
categories.
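The consistency checks described above can be sketched as a simple tally. This is an illustration only: it assumes each (year, LOS, PG) case is held as a dictionary with keys V, S, Y, X, T, and the three sample records are hypothetical rather than drawn from the data base.

```python
def count_violations(records):
    """Count the cases that break each logical relationship among the variables.

    Each record is a dict with keys V, S, Y, X, T for one (year, LOS, PG) case.
    """
    checks = {
        "S >= Y": lambda r: r["S"] >= r["Y"],
        "Y >= X": lambda r: r["Y"] >= r["X"],
        "V >= 0": lambda r: r["V"] >= 0,
        "S >= X": lambda r: r["S"] >= r["X"],
        "V == Y - X": lambda r: r["V"] == r["Y"] - r["X"],
    }
    return {name: sum(1 for r in records if not ok(r))
            for name, ok in checks.items()}

# Hypothetical cases: one clean, one with Y < X and V < 0, one with V != Y - X.
records = [
    {"V": 5, "S": 20, "Y": 15, "X": 10, "T": 100},
    {"V": -2, "S": 8, "Y": 3, "X": 5, "T": 40},
    {"V": 4, "S": 12, "Y": 12, "X": 9, "T": 60},
]

for name, count in count_violations(records).items():
    print(f"{name}: {count} violation(s)")
```

Applied to all 3069 cases, a tally of this kind would yield the counts quoted in the text, and grouping the failing records by year and pay grade would show where they concentrate.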

Finally, the new data appear to have larger inventories
in 1966-67. Specifically 591 x 10^3 vs 586 x 10^3 for 1966 and
662 x 10^3 vs 653 x 10^3 for 1967.

The  currently  used  method  of  forecasting  the  object 
variables  is  a "black  box"  as  far  as  the  user  is  concerned. 

It  requires  the  production  of  "alpha  matrices"  whose  elements 
serve as multipliers to convert the input into the forecast.
Three such 31 by 7 matrices are available to the author,
$\alpha_S$, $\alpha_E$, and $\alpha_N$, and are used as follows: Let $\hat{S}$, $\hat{Y}$, $\hat{V}$ represent
the output projections for the next full time period. Then

$$\hat{S} = \alpha_S T_0$$

$$\hat{Y} = \alpha_E S_0$$

$$\hat{V} = \alpha_N Y_0$$

where $T_0$ is the matrix of most recent inventories, $S_0$ is the
matrix of separations for the most recent period, and $Y_0$ is
the matrix of eligibles for the most recent period. The
indicated multiplications of matrices are elementwise.
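The elementwise use of an alpha matrix can be sketched as follows. This is a toy illustration under stated assumptions: the multipliers are given in percent (as in the tables of this section), the matrices are 2 by 2 instead of 31 by 7, and the function name and all sample values are hypothetical.

```python
def alpha_forecast(alpha_pct, base):
    """Elementwise forecast: each cell of `base` times its multiplier (in percent)."""
    same_shape = len(alpha_pct) == len(base) and all(
        len(a_row) == len(b_row) for a_row, b_row in zip(alpha_pct, base)
    )
    if not same_shape:
        raise ValueError("alpha matrix and base matrix must have the same shape")
    return [
        [a * b / 100 for a, b in zip(a_row, b_row)]
        for a_row, b_row in zip(alpha_pct, base)
    ]

# Hypothetical 2 x 2 example: separations forecast from most recent inventory.
alpha_S_pct = [[2, 5], [12, 27]]          # multipliers in percent
T0 = [[1000.0, 400.0], [500.0, 200.0]]    # most recent inventories

S_hat = alpha_forecast(alpha_S_pct, T0)
print(S_hat)
```

The same function would serve for the other two projections by pairing the eligibles multipliers with the separations matrix and the non-reenlistment multipliers with the eligibles matrix.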

The  alpha  matrices  are  updated  each  year  when  the  data 
become  available.  The  technique  does  not  appear  to  be  well 
documented  but  is  available  in  the  form  of  computer  programs. 

It  is  believed  to  be  a version  of  "Tukey's  Smoothing  Medians" 
and  Tukey's  materials  on  exploratory  data  analysis  may  be 
useful in tracking it down. (See also: McNeil, Interactive
Data Analysis.) Comments concerning the application of it to
the  problem  at  hand  are  contained  in  the  informal  papers 
entitled  'Introduction  to  Smoothing  and  Projection  of  Naval 
Population  Matrix  Time  Series,'  and  'Rif seism  Overview.' 

These  do  not  appear  to  be  very  useful  to  the  analyst. 
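Since the procedure is not documented, the following is only a generic illustration of one building block of median smoothing, a running median of span 3, and should not be read as the alpha method itself. The function name and the sample series are hypothetical.

```python
from statistics import median

def running_median_3(series):
    """Smooth a series with a running median of span 3, copying the two end values."""
    if len(series) < 3:
        return list(series)
    smoothed = [series[0]]
    for i in range(1, len(series) - 1):
        smoothed.append(median(series[i - 1:i + 2]))
    smoothed.append(series[-1])
    return smoothed

# Hypothetical yearly rates for one cell, with a one-year spike in the middle.
rates = [0.20, 0.22, 0.35, 0.23, 0.21]
print(running_median_3(rates))
```

A median smoother of this kind suppresses an isolated spike entirely, which is consistent with the lagging response to trend reversals discussed later in the report, although the actual alpha computation may differ.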

The  three  1976  alpha  matrices  used  in  this  study  appear 
in  Tables  2.1  to  2.3.  The  values  have  been  converted  to 
percentages  and  rounded. 
TABLE 2.1

ALPHA MATRIX MULTIPLIERS TO FORECAST
SEPARATIONS FROM INVENTORY (includes 1976)

                        PG
LOS   E1-3    E4    E5    E6    E7    E8    E9
  1      2     5    19    29    58     3    't
  2     12    27    74    46    45     2    23
  3     37    28    33    43    38     1    93
  4     67    73    93    79    29    10    79
  5     25    25    17    22    36    30     2
  6     41    41    29    38    31    47    99
  7     14    18    16    23    27    21     4
  8     18    22    22    32    32    41     5
  9     11    22    25    32    52    32    10
 10     23    29    32    36    44    26    18
 11     16    21    23    30    43    12     6
 12     36    32    28    33    44    48    11
 13     14    18    24    34    38    37    23
 14     25    24    27    32    29    32    45
 15     28    22    26    31    29    32    35
 16     30    35    31    28    29    30    33
 17     23    16    19    23    25    28    26
 18     65    11    15    16    17    17    17
 19      3     7     7    10    16    19    22
 20      1     1     7    12    19    22    21
 21      1    26    10    16    18    19    21
 22      1    46    25    16    18    23    19
 23      1     2    49    15    22    28    31
 24      1     1    74    20    21    19    21
 25      1     1    82    42    17    17    19
 26      1     4    29    58    22    16    21
 27      1     1    13    27    17    27    29
 28      5     1     3    21    15    15    12
 29      1     1     1     8     7     3     5
 30      1     1    74    17     5     7     9
 31      1     2    13    27     8    10    14
TABLE 2.2

ALPHA MATRIX MULTIPLIERS TO FORECAST
ELIGIBLES FROM SEPARATIONS (includes 1976)

                        PG
LOS   E1-3    E4    E5    E6    E7    E8    E9
  1     38    79    85   100   100     3    99
  2     48    98   100   100    82    14     1
  3     37    98    99   100    95     3    93
  4     41    98   100    98   100    14     1
  5     53    98    99   100   100   100     2
  6     74    97    99   100   100   100   100
  7     78    96    99   100    95    99    56
  8     56    97    99   100   100    98    14
  9     68    95    99   100   100   100    14
 10     64    96    99   100   100    95    56
 11     50    96    99   100   100   100    96
 12     59    93    99   100   100   100    12
 13     51    94    99   100   100   100    56
 14     68    86    99   100   100   100   100
 15     76    97    99   100   100   100   100
 16     95    97   100   100   100   100   100
 17     43    93   100   100   100   100   100
 18     78   100   100   100   100   100   100
 19     16   100    98   100   100   100   100
 20      1     5    94   100   100   100   100
 21      1    99   100   100   100   100   100
 22      1    95    91   100   100   100   100
 23      1    83    95   100   100   100   100
 24      1    45    85   100   100   100   100
 25      1     1    95   100   100    97   100
 26      0    83   100   100   100   100   100
 27      0     1    17   100   100   100   100
 28      0     0    17   100   100   100   100
 29      0     1     1   100   100   100   100
 30      0     0    87    99   100    99    95
 31      0     2    96    99    97   100   100
PG

43 

3 9 

57 

62 

3 

12 

6 8 

14 

4 8 

5 

2 

1 

81 

61 

3 6 

21 

86 

4 

33 

81 

75 

6 0 

19 

54 

1 5 

2 

56 

52 

46 

4 9 

5 3 

99 

3 

3 0 

60 

68 

70 

52 

1 0 

26 

35 

3 7 

46 

46 

19 

5 

56 

1 8 

2 8 

32 

33 

4 

31 

5 

38 

20 

26 

22 

3 

5 

5 

42 

2 0 

23 

1 8 

7 

8 

4 

2 

2 0 

14 

14 

6 

2 

7 

26 

8 

1 5 

1 0 

7 

14 

5 

12 

1 0 

6 

4 

1 

3 

16 

2 3 

5 

4 

2 

4 

5 

3 4 

7 

5 

4 

1 

1 

p 

2 

8 

4 

1 

2 

1 

4 5 

1 

5 

1 

1 

1 

1 

1 

23 

0 

1 

1 

1 

1 

1 0 

1 5 

6 

2 

1 

1 

1 

1 

3 

21 

3 

0 

p 

2 

1 

31 

46 

5 

4 

1 

1 

1 

46 

3 9 

7 

4 

4 

1 

1 

70 

3 4 

6 

1 

1 

1 

1 

46 

1 

1 

p 

1 

1 

1 

1 

1 

16 

5 

1 

1 

1 

70 

2 

19 

7 

3 

1 

1 

1 

1 

2 

1 

1 

2 

1 

1 

2 

22 

5 

1 

1 

1 

1 

3 

4 

2 

1 

1 

1 

1 

1 

1 

1 8 

1 

2 

1 

1 

1 

2 

8 

1 

1 

III. Plots Illustrating the Macro Behavior of the Data


The  five  variables  were  summed  over  LOS  and  PG,  and 
their  behavior  has  been  plotted  against  time,  against  each 
other  for  selected  pairs,  and  the  first  four  against  time  as 
a fraction  of  Inventory.  These  plots,  together  with  comments 
about  them,  follow.  The  totals  appear  first  as  Table  3.1. 


TABLE 3.1

Year         V         S         Y         X         T
1966     73319    120368    117929     43409    591463
1967     99598    153964    150426     49604    662056
1968    101281    136697    133677     31348    663779
1969    125236    161051    158008     31897    673589
1970    138025    191050    180070     41599    684109
1971     87690    155077    130590     42573    605898
1972     74181    134110

542298

490009

1975

47471

45639
FIGURE 3.1

Total stocks of Navy enlisted personnel during the
Vietnam War period and the decline after that period are
illustrated in Figure 3.1. Total Separations (EAOS) are shown
in Figure 3.2. The spike for 1970 surely represents an 'end
of the war' idiosyncrasy. Separations as a fraction of stocks
(Figure 3.3) show two spikes, in 1970 and 1973. We do not know
how much the latter one affects the alpha matrix coefficients
generated to produce a forecast of 1977 EAOS as a multiple
of stocks.
FIGURE 3.2


SEPARATIONS VS TIME

FIGURE 3.3


SEPARATION RATE VS TIME
The next four macro plots are scatter diagrams of
separations with each of the other four variables. They show:

1) a moderate correlation of separations with inventory,

2) stronger correlations of separations with non-reenlistments
and eligibles,

3) virtually no correlation between separations and retentions.
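A correlation coefficient gives a numerical counterpart to what the scatter diagrams show. The sketch below is illustrative only: it assumes the yearly totals of Table 3.1 for 1966 to 1971 (six of the eleven years) as input, so its values need not agree with impressions formed from the full eleven-year plots. The function name is hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Yearly totals for 1966 to 1971 as read from Table 3.1.
S = [120368, 153964, 136697, 161051, 191050, 155077]   # separations
T = [591463, 662056, 663779, 673589, 684109, 605898]   # inventory
X = [43409, 49604, 31348, 31897, 41599, 42573]         # retentions

r_ST = pearson(S, T)
r_SX = pearson(S, X)
print(f"corr(S, T) = {r_ST:.2f}")
print(f"corr(S, X) = {r_SX:.2f}")
```

With so few yearly totals, any such coefficient is very imprecise, so it is best treated as a rough summary of the plots rather than as evidence in its own right.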
FIGURE 3.4

SEPARATIONS VS INVENTORY

FIGURE 3.5

NON-REENLISTMENTS VS SEPARATIONS

FIGURE 3.6


RETENTIONS VS SEPARATIONS

FIGURE 3.7

ELIGIBLES VS SEPARATIONS
Figure 3.8 has special interest since the 'alpha method'
forecasts eligibles as a proportion of separations. Cyclic
behavior begins about 1970 (post war). The 1977 forecast is
expected to continue the downtrend exhibited above in the more
recent years. Indeed, based on this graph the forecast for
1976 is expected to be rather good. Time series smoothing
methods have a tendency to do well when the trend continues,
but they are caught when the series either tops or bottoms out.
FIGURE 3.9


ELIGIBLES VS TIME

FIGURE 3.10

ELIGIBLES VS INVENTORY

FIGURE 3.11

FIGURE 3.12

NON-REENLISTMENTS VS ELIGIBLES

FIGURE 3.13

ELIGIBLES AS A FRACTION OF INVENTORY VS TIME
Figures 3.9 thru 3.13 focus on the macro behavior of
eligibles. This variable is more directly related to policy
than are the others. Figure 3.13 suggests that the relevant
policy is not very stable. Again the spikes of 1970 and
1973 are the most disruptive. Figure 3.11 suggests a weak
negative correlation of eligibles with retentions.

Figures 3.7, 3.10 and 3.12 indicate important positive
correlations of eligibles with separations, inventory and
non-reenlistments. Figure 3.9 contains little new information
in the light of the others.
FIGURE 3.14


NON-REENLISTMENTS AS A FUNCTION OF ELIGIBLES VS TIME

The 'alpha method' forecasts non-reenlistments as a
proportion of eligibles. This signal appears to have bottomed
out in 1975 and the method is presumed to respond to this.
Although we would expect the 1976 forecast (which is based on
data thru 1975) to be poor because of the tendency to overswing,
forecasts using data thru 1976 should not suffer as much.
FIGURE 3.15

FIGURE 3.16

FIGURE 3.17


NON-REENLISTMENTS VS INVENTORY

FIGURE 3.18

NON-REENLISTMENTS VS RETENTIONS
Figures 3.15 and 3.16 show the macro behavior of non-reenlistments
and the non-reenlistment rate as a function of time. They are
readily interpretable in the light of war and post-war years.
Figures 3.5, 3.12, 3.17, and 3.18 are the scatter plots of
non-reenlistments with each of the other four variables. They
show positive correlations of non-reenlistments with inventory,
separations, and eligibles, while the relationship with
retentions is negative and appears noisy. Recall that the
relationship V = Y-X breaks down in 26% of the cases.

FIGURE  3.19 
RETENTIONS  VS  TIME 
FIGURE 3.20


RETENTIONS AS A FRACTION OF INVENTORY VS TIME

FIGURE 3.21

RETENTIONS VS INVENTORY
For completeness, the macro behavior of retentions is
included. Figures 3.19 and 3.20 show somewhat smooth cyclical
structure. The scatter plots with the other four variables
appear in Figures 3.6, 3.11, 3.18 and 3.21. All four indicate
modestly negative correlations.
IV. Supporting Details of the Comparison


The data bank for any of the object variables may be
viewed as a time ordered set of eleven 31 by 7 matrices. As
we move thru time, the numbers in these matrices will roll and
twist as in a wave motion that reflects the changing size of
the total enlisted force and how those changes affect the various
cells. The set of inventory matrices forms the base of the stocks
of people. Although individual people are not tracked in this
set, it is helpful to draw attention to the fact that an individual
changes LOS cell each year and PG cell periodically. Thus
in these recent years of drawdown (in size) one expects relatively
higher exits from the lower LOS cells. Such motion is reflected
also in the other four variables since inventory provides the
base of stocks on which each of the others draws.

A little more detail concerning the three object
variables can be obtained by studying Tables 4.1 thru 4.3, which
compare their changes over the most recent two years for the
first 18 LOS cells and all 7 PG cells. The macro behavior of
EAOS as shown in Figure 3.2 suggests that recently a smooth
decline has taken place. Table 4.1 shows that these changes
have been rather drastic in the first two columns and the lower
LOS cells. Since the 'alpha' method smooths the time series
cell by cell, it is expected to perform well in those cells
that sustain a trend and poorly in those that do not.
TABLE 4.1

RECENT BEHAVIOR OF EAOS

Separations 1975

LOS    E1-3      E4      E5      E6      E7      E8      E9
  1    2019     700      18       5      10       0       3
  2    7590    4940     323      30       2       1       0
  3    7796    7126    2205      25       3       0       3
  4    5627   17820   12906     115       7       1       0
  5     551    1666    2038     135       8       1       0
  6     466    1270    3712     722      10       1       2
  7     135     209    1216     626      12       1       2
  8      67     160    1053    1049      18       2       0
  9      49     189    1094    1505      68       2       0
 10      37     259    1193    1580     125       1       1
 11      12      98     600    1379     249       2       0
 12       9      70     543    1555     408      17       2
 13       3      38     447    1921     624      36       1
 14       8      72     465    1833     717      74       5
 15       4      42     314    1694     908     167      20
 16       5      48     288    1505    1092     235      42
 17       0      13     132    1089    1147     322      69
 18       3      10      91     672     799     242      62

Separations 1976

LOS    E1-3      E4      E5      E6      E7      E8      E9
  1     624      70      18       7       6       0       0
  2    6017    2942     190      23       5       0       2
  3   14352   12092    1140      21       8       0       2
  4    5906   15940   15649      54       2       0       1
  5     469    1471    1894     103       8       1       1
  6     221    1280    3379     823       7       1       1
  7      78     210    1118     620      14       1       0
  8      37     175    1172    1205      20       4       0
  9      19     129     750    1168      76       6       0
 10      35     192    1096    1671     129       2       0
 11      11      89     631    1093     180       1       1
 12       7      74     488    1132     264      11       0
 13       2      20     252    1079     385      17       1
 14       3      25     310    1219     471      39       5
 15       3      26     337    1180     615      55       5
 16       3      27     233    1010     638      94      19
 17       2      11     115     671     613     132      30
 18       2       4      60     404     490     124      29


TABLE 4.2

RECENT BEHAVIOR OF ELIGIBLES

Eligibles 1975

LOS    E1-3      E4      E5      E6      E7      E8      E9
  1     894     685      16       5      10       0       3
  2    3386    4814     323      30       1       1       0
  3    2879    6941    2175      25       3       0       3
  4    2265   17372   12786     112       7       1       0
  5     284    1623    2003     134       8       1       0
  6     370    1221    3671     716      10       1       2
  7     106     202    1203     622      11       1       2
  8      46     157    1039    1047      18       1       0
  9      35     178    1084    1497      68       2       0
 10      25     248    1172    1577     125       0       1
 11       5      96     594    1371     249       2       0
 12       4      67     538    1549     408      17       1
 13       1      37     438    1915     624      36       1
 14       7      72     461    1828     712      74       5
 15       3      40     311    1688     907     167      20
 16       5      45     288    1592    1091     235      42
 17       0      12     131    1085    1147     322      68
 18       3       9      91     672     799     242      62

Eligibles 1976

LOS    E1-3      E4      E5      E6      E7      E8      E9
  1     217      51      15       7       6       0       0
  2    2739    2876     190      23       4       0       2
  3    5124   11774    1123      21       7       0       2
  4    2390   15434   15516      53       2       0       1
  5     264    1425    1854     103       8       1       1
  6     131    1233    3338     817       7       1       1
  7      57     199    1107     618      14       1       0
  8      18     167    1157    1197      20       4       0
  9      11     126     739    1163      76       6       0
 10      21     182    1078    1660     129       2       0
 11       7      84     620    1090     180       1       1
 12       4      68     481    1121     263      11       0
 13       1      18     252    1076     383      17       0
 14       2      20     305    1216     469      39       5
 15       3      25     332    1177     611      55       5
 16       3      27     231    1007     638      94      19
 17       2      10     115     669     612     132      30
 18       1       4      60     482     489     124      29


The macro behavior of eligibles as shown in Figure 3.9
also declines smoothly, but Table 4.2 shows some sharp drops for
LOS = 1 and some sharp increases for LOS = 3. In the former case
the advantage is to the alpha method and in the latter it is not.
Regression methods must look elsewhere to pick up these signals.

According to Figure 3.15, non-reenlistments level off in
the macro sense. Table 4.3 shows some rather drastic movements
for the lower LOS and PG cells. Again LOS = 3 shows some
sharp increases.

The forecasting of the object variables using regression
methods was performed repeatedly to meet several goals: First,
the best set of p variables had to be identified. Second,
the influence of the ridge constant (see Appendix A) needed
some accounting. Third, the stability of the forecast when
1976 data were not included in the forecast coefficients required
examination.

Table 4.4 contains a listing of the best set of p
variables. The subscripts indicate time, and the object variable
appearing in the set with subscript t-1 indicates an
autoregressive contribution with a lag of one year. The corresponding
MAE and RMSE values appear in Table 1.1.

The resulting forecasting of separations does not depend
very much on whether or not 1976 data are included in the
forecast coefficients (as measured by the MAE and RMSE values),
but the inclusion does make a noticeable difference in
forecasting eligibles and non-reenlistments. This result was not
anticipated from study of the macro plots. It may be, in part,
a consequence of ill-conditioning (see Appendix A).
TABLE 4.3


LOS 


LOS 


RECENT  BEHAVIOR  OF  NON-REENLISTMENTS 


Non 

-Reenlistments 

1975 

El-3 

E4 

E5 

E6 

E7 

E8 

E9 

117 

56  0 

6 ■ 

2 

6 

0 

3 

29  76 

3125 

4 9 

7 

"2 

0 

0 

2401 

4103 

764 

7 

3 

0 

1 

1 R72 

1 3070 

6 9 69 

1 9 

4 

1 

0 

171 

977 

963 

51 

2 

1 

0 

91 

736 

2 3 5? 

46'’ 

5 

0 

1 

30 

74 

536 

261 

2 

0 

2 

12 

53 

277 

335 

~2 

”1 

0 

13 

37 

279 

3 02 

3 

0 

“l 

9 

51 

2 61 

2 56 

11 

”1 

0 

0 

12 

73 

156 

1 3 

"l 

0 

1 

6 

7 5 

139 

1 3 

1 

0 

0 

9 

20 

3 9 

12 

“6 

0 

1 

6 

2 4 

6 0 

9 

4 

”2 

3 

2 

1 6 

2 2 

~2 

"2 

"l 

0 

1 

9 

2 3 

1 

1 

”9 

0 

0 

6 

6 

"l  0 

“5 

~ 9 

~1 

0 

9 

1 

"1  9 

~4 

0 

Non 

-Reenlistments 

1976 

El-3 

E4 

E5 

E6 

E7 

E8 

E9 

133 

14 

6 

4 

5 

0 

0 

23  59 

1957 

2 6 

1 3 

1 

"l 

2 

4137 

915  3 

370 

2 

6 

9 

1 

1962 

110  9 0 

996  7 

1 9 

1 

0 

0 

1 52 

650 

7 64 

47 

4 

1 

1 

52 

667 

23  65 

553 

4 

0 

0 

2 5 

17 

509 

2 94 

3 

1 

0 

~1 

4 1 

3 64 

3 96 

"3 

2 

0 

4 

20 

172 

242 

0 

2 

0 

10 

31 

221 

297 

6 

0 

~1 

2 

19 

79 

140 

1 3 

~1 

“l 

1 

2 

40 

0 4 

1 7 

2 

0 

0 

2 

23 

47 

1 2 

"l 

1 

7 

9 

2 6 

2 

1 

1 

3 

5 

20 

0 

“3 

"2 

1 

2 

6 

~1 

“6 

0 

~3 

1 

0 

1 

~3 

~9 

*1 

"l 

0 

1 

0 

"l 

31 

■ 6 

~ 9 

“l 
TABLE 4.4

VARIABLE  SELECTION  FOR  REGRESSION  FORECASTS 


Object  Variable 

p = 2 

P = 3 

p = 4 

“ “ I 

Separation 

^t'Vl 

^t'Vl'Vl 

'^t'^t-1  ''^t-1  '^t-1 

Eligibles  Y^ 

^'\-l 

\'Vi-Vi 

"^t'^^t-l'^t-l  ''^t-l 

Non-Reenlistments  V^ 

\'Vl 

Tt«V,-i,Y^_l 

\'Vi'Vi'\-i 

The use of a ridge constant in multiple regression
serves to stabilize the regression coefficients when the
regression variables are highly correlated (i.e. the problem is
ill-conditioned). The picture is cloudy, but generally its
effect is noticeable in forecasting separations and eligibles
but not so much in forecasting non-reenlistments. A high
level of ill-conditioning is anticipated whenever V, Y, and X
are used in concert because of the logical relationship V = Y-X.
Also a non-zero ridge constant may be appropriate when p = 2
for eligibles and non-reenlistments. The value RC = .025,
in the space of standardized regression variables, seems to
be a reasonable choice for the ridge constant, but overall,
the ill-conditioning and stability merit further study.
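The role of the ridge constant can be sketched as follows. This is a generic ridge regression in standardized variables, not a transcription of the APL programs of Appendix C: each regressor and the response are centered and scaled to unit length, the ridge constant is added to the diagonal of the resulting correlation matrix, and the coefficients are converted back to original units. All names and the sample data are hypothetical, and no column may be constant.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def standardize(col):
    """Center a column and scale it to unit length; returns (scaled, mean, norm)."""
    mean = sum(col) / len(col)
    centered = [v - mean for v in col]
    norm = math.sqrt(sum(v * v for v in centered))
    return [v / norm for v in centered], mean, norm

def ridge_fit(columns, y, rc=0.025):
    """Ridge regression with constant rc added to the correlation matrix diagonal.

    Returns (intercept, coefficients) expressed in the original units.
    """
    zs, means, norms = zip(*(standardize(c) for c in columns))
    zy, y_mean, y_norm = standardize(y)
    p = len(columns)
    r_xx = [[sum(a * b for a, b in zip(zs[i], zs[j])) + (rc if i == j else 0.0)
             for j in range(p)] for i in range(p)]
    r_xy = [sum(a * b for a, b in zip(zs[i], zy)) for i in range(p)]
    beta_std = solve(r_xx, r_xy)
    coefs = [b * y_norm / n for b, n in zip(beta_std, norms)]
    intercept = y_mean - sum(c * m for c, m in zip(coefs, means))
    return intercept, coefs

# Hypothetical data: two highly correlated regressors and a response.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]
y = [2.0, 4.1, 6.2, 7.9, 10.1, 12.0]

print("RC = 0    :", ridge_fit([x1, x2], y, rc=0.0))
print("RC = 0.025:", ridge_fit([x1, x2], y, rc=0.025))
```

Setting the constant to zero recovers ordinary least squares, so running both calls on the same data shows how much the constant alters the coefficients when the regressors are nearly collinear.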

The  remainder  of  this  section  is  devoted  to  the 
numerical  comparison  of  'alpha'  forecasts  with  regression 
forecasts  for  p = 4 in  the  case  of  separations,  and  for 
p = 3 for the other two variables. In the latter cases the
contribution of the fourth variable is suspect because of the
relationship  V = Y-X  and  the  improvement  indicated  in  Table  1.1 
may  be  unstable. 

The computation of multiple regressions individually
for each of the 217 (31 x 7) cells is too time consuming and
some grouping was necessary. The grouping chosen was based on
the following partitioning of the eleven year averages for each
of the object variables. Cells were treated individually as
long as the eleven year averages were at least 1000 (1900 in
the case of eligibles). Then cells were grouped together in
decrements of 100 down to an eleven year average of 100. From
then on the groupings were in decrements of 10 with some arbitrary
adjustments when zero was approached.
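The grouping rule can be sketched as follows. This is one reading of the description above: the exact bin boundaries are an assumption, the "arbitrary adjustments" near zero are not modeled, and the function names and sample averages are hypothetical.

```python
def group_key(avg, individual_floor=1000):
    """Return the (lower, upper) bin for an eleven year average, or None if the
    cell is large enough to be treated individually."""
    if avg >= individual_floor:
        return None
    width = 100 if avg >= 100 else 10
    lower = int(avg // width) * width
    return (lower, lower + width)

def group_cells(averages, individual_floor=1000):
    """Partition cells into regression groups.

    `averages` maps a (LOS, PG) cell to its eleven year average. Large cells
    come first as one-cell groups, followed by the bins in descending order.
    """
    singles = []
    bins = {}
    for cell, avg in averages.items():
        key = group_key(avg, individual_floor)
        if key is None:
            singles.append([cell])
        else:
            bins.setdefault(key, []).append(cell)
    return singles + [bins[k] for k in sorted(bins, reverse=True)]

# Hypothetical eleven year averages for six cells.
averages = {
    (4, 2): 15000.0,
    (1, 1): 950.0,
    (2, 1): 910.0,
    (9, 5): 72.0,
    (9, 6): 78.5,
    (12, 7): 3.0,
}
for group in group_cells(averages):
    print(group)
```

For eligibles the threshold would be passed as `individual_floor=1900`, following the figure given in the text.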

Table  4.5  compares  the  (rounded)  'alpha'  forecasts  and 
regression  forecasts  (on  T^,  separations. 

The ridge constant was .025 and all data were used in developing
the regression coefficients. The (rounded) errors of the
two forecasts are compared in Table 4.6.

Let us consider Table 4.6 in the light of the recent
changes in separations shown in Table 4.1. For LOS = 1
separations experienced a severe drop in 1976 in the first two
columns. The "alpha" method picked this up amazingly well.
Of course this success is attributed to the use of 1976 data in
the development of the coefficients. The same effect is
exhibited in LOS = 2 for E4. The first two entries for
LOS = 3 in Table 4.1 show a sharp increase. This is a trend
reversal that the alpha method does not respond to so rapidly
even under these advantageous conditions. It makes its worst

Tables 4.7 and 4.8 compare the forecasts and errors of
forecasts of eligibles for the 'alpha' and regression (on
T^, methods. Again when we consider the major
changes in 1976 by viewing Table 4.2 we see a sharp drop in
columns 1 and 2 for LOS = 1, a substantial drop in these columns
for LOS = 2, and an increase for LOS = 3. Again the alpha
method picks this up extremely well for LOS = 1, adequately well
for LOS = 2, and responds poorly to the turnaround for LOS = 3.
The regression method does surprisingly well for LOS = 1 and
column 1.

This same phenomenon continues when comparing the
forecasts of non-reenlistments. Table 4.3 shows a sharp drop in
LOS = 1 that the alpha method picks up well, and a drop in cell (2,2)
that is not so well forecast. The increases for LOS = 3 and cell
(4,3) are again handled much better by the regression methods.

V.  Conclusions and Recommendations

A fair  comparison  of  the  two  methods  could  not  be  made 
on  the  one  hand  because  1977  actuals  are  not  yet  available  and, 
on  the  other  hand,  because  the  alpha  matrices  for  1975  were  not 
available.  The  alpha  method  appears  to  be  a time  series  type 
method  that  possesses  the  typical  lag  characteristics  at  peaks 
and  troughs.  The  regression  methods  appear  to  be  competitive 
and  have  the  feature  of  being  more  stable. 

Some examination of the residuals (over time) of the
regression methods took place. Although some first lag
correlations are present, they do not appear to be significant, based
on the computation of Durbin-Watson statistics. Thus a hybrid
system using both time series and regression methods is not
expected to produce substantial improvements.
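
For reference, the Durbin-Watson statistic is the ratio of the sum of
squared successive differences of the residuals to their sum of
squares. The sketch below is a minimal Python illustration (the
function name is hypothetical, and it is not the RESID program of
Appendix C). Values near 2 indicate little first-lag correlation,
values near 0 strong positive correlation, and values near 4 strong
negative correlation.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic of a time-ordered sequence of residuals."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    # Numerator: squared differences between consecutive residuals.
    numerator = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    # Denominator: sum of squared residuals.
    denominator = sum(e * e for e in residuals)
    if denominator == 0:
        raise ValueError("all residuals are zero")
    return numerator / denominator
```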

It  is  recommended  that  a fair  comparison  be  made  and 
that  an  appropriate  measure  of  the  cost  of  forecast  error  be 
developed  for  it.  The  real  extent  of  time  series  "overswing" 
would  become  apparent  and  its  effect  would  be  examined  in 
more  realistic  terms. 

It is desirable to develop the regression approach
further. The question of the correct set of ridge constants
needs to be faced more carefully, as does the question of the
grouping of cells. The residuals should be examined more
carefully with one eye focused on the time dependence behavior
and a second eye looking for suitable transformations to
remove skewness and stabilize the variance.

APPENDIX A


FORECASTING  MANPOWER  CHANGES  USING  REGRESSION  METHODS 

Data and Notation

Ten years of data are available, specifically fiscal 1966-1975 (the
fiscal year begins on 1 July of the previous year). The first subscript, t,
will index the years (t = 1,...,10) with t = 1 referring to
fiscal 1966 (1 July 1965 to 30 June 1966), etc. The second and
third subscripts refer to length of service (LOS) and pay grade (PG)
categories respectively, using the indices i and j. The 31 LOS
categories are interpreted as follows: An enlisted
man with LOS = i is one who on 1 July of that period had completed
at least i - 1 years of service but less than i, for i = 1,...,30.
If i = 31 then the number of years of completed service is "at least
30." There are nine PG categories referring to the pay grades
E1,...,E9.

The resulting data arrays were too cumbersome for our
exploratory work and some arbitrary grouping was imposed. Specifically,
LOS categories 6 through 17 were grouped together as were categories
18 through 31. Also PG categories E1, E2, E3 were aggregated.
Thus the present study treats seven LOS groups and seven PG groups.
Notice that the detail lost in aggregating the high LOS cells is
partially recovered because the corresponding (high) PG cells are
intact. Similarly, the intactness of the low LOS cells retains
some of the information lost by aggregating the low PG cells.

The  following  notation  was  adopted  for  the  given  data. 

     G    gains
     A    attritions
     R    retirements
     X    retentions (re-enlistments)
     Y    eligibles
     S    separations (EAOS)  (S > Y > X)
     T    inventory (total Navy enlisted)

Some additional derived quantities are useful.

     U = S - Y            ineligibles
     V = Y - X            non-reenlistments
     W = U + V = S - X    contract losses

Some explanation of these quantities is helpful. Gains
refer to the number of people from outside the Navy that enter the
enlisted Navy during the fiscal year in question, in each LOS, PG
category used. New recruits are not included since they need not
be forecast. Promotions represent internal movement and are also
not reflected in the gains used here. Changes which are included
in gains are those persons who reenter the service after having left,
under programs called continuous service or broken service
reenlistment contracts. A category called miscellaneous gains is also
included here, representing gains by various methods, not including
the recruits.

Attritions refer mainly to people who are dismissed prior
to the expiration of their contract. They also include deaths,
disability discharges, etc. Retirements begin in LOS category
18. The only means of leaving the Navy aside from attrition and
retirement is by failure to reenlist at the expiration of the
contract, i.e., contract loss. All personnel are separated at the
end of their contract. Not all separated personnel are declared
eligible for reenlistment and not all eligibles exercise their
option to reenlist. Hence the inequality S > Y > X. The derived
differences (S - Y and Y - X) are called ineligibles and
non-reenlistments, respectively. The sum of these two is the contract
losses.
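
The derived quantities amount to two subtractions and one sum. The
following Python sketch states them explicitly. The function name is
hypothetical, and the check uses weak inequalities (S >= Y >= X) on
the assumption that equality can occur in small cells.

```python
def derived_quantities(separations, eligibles, retentions):
    """Return (U, V, W) = (ineligibles, non-reenlistments, contract losses)."""
    if not (separations >= eligibles >= retentions >= 0):
        raise ValueError("expected S >= Y >= X >= 0")
    ineligibles = separations - eligibles              # U = S - Y
    non_reenlistments = eligibles - retentions         # V = Y - X
    contract_losses = ineligibles + non_reenlistments  # W = U + V = S - X
    return ineligibles, non_reenlistments, contract_losses
```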

To all the variables may be affixed the subscripts t, i, j
which refer to periods and categories already described. All of
the variables (except T) are interval functions and refer to the
net result of a time period (fiscal year). The variable T,
referring to the total size of the Navy, is a point or "snapshot"
variable and refers to the number of personnel on board on the
first day of the designated period (left end point or, more
specifically, 1 July of the fiscal year).

Scope  of  Current  Study 

Regression methods, specifically ridge regression, are applied
to the forecasting of contract losses, gains, and attrition by total
numbers in each LOS group and in each PG group. All data sets of
variables are three dimensional arrays. Since the present work is
concerned with exploring the usefulness of a methodology, the
dimensions of the data sets were reduced to two in order to obtain
simplicity and uniformity. This was done in two different ways
since LOS groups and PG groups have separate interest. Thus when
studying the predictability of the LOS groups, all data arrays were
summed over j = 1,...,7 yielding time by LOS group matrices.
Similarly when studying the predictability of the PG groups, all
data arrays were summed over i = 1,...,7 yielding time by PG group
matrices. No combining or mixing of the two kinds of groups took
place in this study. Thus we attempted to predict the various
changes by LOS and by PG marginally, but not jointly.

The choice of regression variables was made on heuristic
grounds. The volume of data is quite limited (ten time values)
and this limits one to the use of only four or five regression
variables so that at least a few degrees of freedom remain for
estimating the mean square error. Also any autoregressive feature
was limited to the single most recent time period (or time point,
in the case of the variable T). Thus the ten time periods still
yield nine full sets of observations. The degrees of freedom for
estimating error are given by n - p - 1 where n = 9 and p = the
number of regression variables. The choice of variables (same for
each of the two kinds of groups, LOS and PG) appears below.

Contract  Losses  regressed  on  Ineligibles,  Reenlistments, 
Separations,  Total  Inventory. 

Gains  regressed  on  Gains,  Attritions,  Ineligibles,  Separations, 
Total  Inventory. 

Attritions  regressed  on  Attritions,  Ineligibles,  Reenlistments, 
Total  Inventory. 

Further  exploration  could  yield  a better  set  of  regression 
variables.  The  results  so  far  are  encouraging  as  will  be  seen. 

Methodology 

The development of ridge regression in the last five years
(see Ref. 3, 4, 7) is proving to be an important step in treating
the anomalies of regression problems. Its use is especially
attractive where the correlation matrix of the regression variables
is highly non-orthogonal--a condition that is met liberally in the
present problem. The key to its successful application is in the
selection of the ridge constant k. Current thinking on this
question recognizes several competing forces and suggests the
selection of a range of values for k in which all the forces are rather
stable.

More specifically, these requirements may be summarized as
follows:

(i)  The variance inflation factor of the estimates of the
regression coefficients should be at least one but certainly not
as large as ten (Ref. 6, p. 609ff.; Ref. 7).

(ii)  The ridge trace should be stable. This includes the
accomplishment of all reversals in sign with respect to the
initial signs at k = 0 (ordinary least squares regression) (Ref. 4).

(iii)  The mean square error (MSE) of forecast should not have
increased greatly beyond the initial values at k = 0 (Ref. 4, 7).

These three requirements are the author's set of guidelines
formed from the materials in the references. The first deals with
the variance inflation factors, which are defined by Marquardt
as the diagonal elements of (see Ref. 6, p. 609)

     (X'X + kI)^{-1} (X'X) (X'X + kI)^{-1}

when X'X is in correlation form (i.e., the correlation matrix of the
regression variables). For our set of problems, criterion (i) is
met uniformly for k ≥ .02.
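
As an illustration of criterion (i), the following Python sketch
evaluates the diagonal of (X'X + kI)^{-1} (X'X) (X'X + kI)^{-1} for a
correlation matrix. The helper names and the two-variable example with
correlation 0.9 are assumptions for illustration and are not taken
from the study's data.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][m] * b[m][j] for m in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def ridge_vif(corr, k):
    """Diagonal of (R + kI)^-1 R (R + kI)^-1 for a correlation matrix R."""
    n = len(corr)
    shifted = [[corr[i][j] + (k if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    inverse = invert(shifted)
    product = matmul(matmul(inverse, corr), inverse)
    return [product[i][i] for i in range(n)]


# Hypothetical example: two regressors with correlation 0.9.
example = [[1.0, 0.9], [0.9, 1.0]]
vif_ols = ridge_vif(example, 0.0)    # about 5.26 for each variable
vif_ridge = ridge_vif(example, 0.1)  # 1.4875 for each variable
```

In this small example a ridge constant of 0.1 brings the factor from
about 5.26 down to about 1.49, which is the kind of movement the
criterion looks for.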

The ridge trace is the vector of regression coefficients
viewed as functions of k, the ridge constant (Ref. 3). Typically,
they are unstable and change sharply as k moves away from zero, but
settle down after that. It is recommended that k be large enough
so that any regression coefficient that is going to change its sign
from that at k = 0 be allowed to do it. Much of the information
in the ridge trace is summarized by L^2(k), the squared length of
the regression vector of coefficients. The ridge trace is also
valuable if one is selecting variables for deletion.

For our set of problems, many of the ridge traces settle down
quite quickly by the time k reaches .04. There are some
stragglers, however, but even so all have stabilized by the time k has
reached .2.
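
A ridge trace in standardized form can be sketched as follows: for
each k the coefficient vector solves (R + kI) b = r, where R is the
correlation matrix of the regression variables and r is the vector of
their correlations with the dependent variable. The Python below is an
illustrative sketch with hypothetical names and made-up correlations;
it is not the REGR program of Appendix C.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def ridge_trace(corr, corr_with_y, ks):
    """Return a list of (k, standardized coefficients, squared length L^2(k))."""
    n = len(corr)
    trace = []
    for k in ks:
        shifted = [[corr[i][j] + (k if i == j else 0.0) for j in range(n)]
                   for i in range(n)]
        coefficients = solve(shifted, corr_with_y)
        squared_length = sum(c * c for c in coefficients)
        trace.append((k, coefficients, squared_length))
    return trace


# Hypothetical example: the second coefficient changes sign between k = 0 and k = 0.1.
example_trace = ridge_trace([[1.0, 0.9], [0.9, 1.0]], [0.8, 0.7], [0.0, 0.1])
```

In this made-up example the second coefficient is negative under
ordinary least squares and positive at k = 0.1, while the squared
length falls, which is the pattern of sign reversal and shrinkage
described above.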

Accordingly, the range .05 ≤ k ≤ .2 was chosen for further
study. Typically the MSE grows modestly in this range. The movement
of the MSE is represented by the movement of R^2, where 1 - R^2
is the ratio of the sum of squared errors to the sum of squares of
the dependent variable. This corresponds to looking at the
degradation in the square of the multiple correlation coefficient, the two
coinciding when k = 0.

Table 1 contains initial information for the application of
ridge regression using the chosen variables. On the left are the
three kinds of variables W (contract losses), G (gains), and A
(attrition). For each there are 7 LOS groups and 7 PG groups. Ridge
regressions were performed for k starting at zero, advancing in
increments of .005 until 0.1 is achieved and then 0.2, 0.4, 0.6, 0.8,
and 1.0.
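
The grid of ridge constants just described can be written out
explicitly. The Python sketch below (function name hypothetical)
builds it from integer steps to avoid accumulated rounding error.

```python
def ridge_constant_grid():
    """k = 0, .005, ..., .1 (steps of .005), then .2, .4, .6, .8, 1.0."""
    fine = [round(i * 0.005, 3) for i in range(21)]  # 0.0 through 0.1
    coarse = [0.2, 0.4, 0.6, 0.8, 1.0]
    return fine + coarse
```

This gives 26 values of k per regression problem.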

The ranges of values of k that brought the VIF (variance
inflation factor) into the range of one to ten are tabulated next.
Values of k this small are expected when all variables have been
standardized (Ref. 6).

The column headed R^2(0) is the square of the multiple
correlation coefficient under ordinary least squares (k = 0).
It is one minus the ratio of the sum of squared residuals to the sum
of squares of the dependent variable. Its use as a measure of the
percent of variance accounted for is tenuous in the current
application because there are so few degrees of freedom to estimate the
variance of residuals (Ref. 1, 5). Thus, large values are
encouraging but not to be depended upon. They measure the level of
"explainability" for this particular set of data, but the measure
is not reliable for prediction. It can be used to measure the
change in the sum of squares of residuals as k increases.

The remaining data in the table give indications of the degree
of "ill conditioning" of the problem. The "min eigenvalue" refers
to the correlation matrix X'X of the regressor variables. For
orthogonal data all eigenvalues are unity. The small values indicate
substantial ill conditioning (Ref. 3, 7). The quantity L^2(k) is
the squared length of the regression coefficient vector. It will
be greatest when k = 0. The L^2 range values correspond to values
of k in the preceding k range. These values have stabilized in
all cases. The asterisks (*) denote those groups whose regression
coefficients have not stabilized in sign within the k range.

Some basic data for our 42 cases are contained in Table 2.
Following the designator columns are the means, standard deviations,
and coefficients of variation (ratio of standard deviation to mean)
of the dependent variables.

For illustrative purposes (and for fun) it was decided to
display a complete set of regression predictions for 1976. The value
k = .05 was chosen arbitrarily and applied uniformly. The values
R^2(.05) should be compared with the R^2(0) values of the earlier
table to indicate the growth of the sum of squared residuals as k
increases. The last two columns contain the 1976 forecasts and their
root mean square errors (sum of squared residuals over n - p - 1,
raised to the one-half power). Because of the correction for degrees
of freedom the RMSE is actually measurably larger than the dependent
variable standard deviation in five of the cases (W LOS 1, W LOS 5,
G LOS 2, A LOS 2, A LOS 6-17). In thirteen of the cases it is
dramatically smaller--this is especially notable because all of the
five figure standard deviations are converted to four figure RMSE
values, and several four figure standard deviations are reduced
to three figures. The remaining cases are in the range of no change
to modest improvement.
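
The RMSE used here divides the sum of squared residuals by n - p - 1
rather than by n. A minimal Python sketch (function name hypothetical)
makes the degrees-of-freedom correction explicit.

```python
import math


def rmse_with_dof(residuals, p):
    """Square root of (sum of squared residuals) / (n - p - 1)."""
    n = len(residuals)
    degrees_of_freedom = n - p - 1
    if degrees_of_freedom <= 0:
        raise ValueError("need n > p + 1 observations")
    return math.sqrt(sum(e * e for e in residuals) / degrees_of_freedom)
```

With n = 9 observations and p = 4 regression variables only four
degrees of freedom remain, so the divisor is less than half of n. This
is why the RMSE can exceed the standard deviation of the dependent
variable in a weakly fitted case.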

Finally  the  corresponding  regression  coefficients  (converted 
back  to  the  original  dimensions)  appear  in  Table  3.  These  values  need 
interpreting,  i.e.,  why  should  contract  losses  be  negatively  correlated 
with  separations  in  some  cases  and  positively  in  others,  etc.  These 
questions  may  have  rational  answers,  or  they  may  indicate  the  need 
for  a better  selection  of  variables. 

Discussion 

The current exploratory work should be continued, seeking
better sets of regression variables (not necessarily uniform in
kind across the cases), individualized values of the ridge constant
(k), and perhaps some refined modeling. Also there may exist some
important exogenous variables (e.g., dates of major policy changes,
planning targets for the size of the Navy, the unemployment rate),
but they may be hard to identify.

TABLE 1

k range (surviving rows only; group labels in brackets are inferred
from their position in the sequence)

Group          k range

               0.,    .1
               0.,    .1
               .01,
4              .005,  .1
5              0.,    .1
[6]            0.,    .1
7              0.,    .1
8              0.,    .05
9              0.,    .1
LOS 1          0.,    .1
2              0.,    .05
[3]            0.,    .02
4              0.,    .1
[5]            0.,    .1
[6-17]         0.,    .1
[18-31]        0.,    .05
[PG 1-3]       .01,   .1
4              0.,    .1
5              0.,    .1
6              0.,    .02
7              0.,    .1
8              0.,    .05
9              0.,    .1

R^2(0), minimum eigenvalue, L^2(0), and L^2 range

Dep                          Min.
Var  Group       R^2(0)   Eigenvalue   L^2(0)    L^2 range

W    LOS 1        .41       .006        24.6     .55,  2.47
     LOS 2        .94       .002        55.7     .74,   .93
     LOS 3        .98       .005         4.6     .93,  1.11
     LOS 4        .99       .012         1.0     .90,   .93
     LOS 5        .36       .005        32.3     .94,  2.3
     LOS 6-17     .91       .104         1.1     .72,  1.12
     LOS 18-31    .55       .058          .9     .53,   .89
     PG 1-3       .84       .002        26.8     .57,  1.10
     PG 4         .96       .021         1.1     .73,  1.03
     PG 5         .74       .010          .8     .70,   .77
     PG 6         .65       .087         1.6     .85,  1.63
     PG 7         .81       .060          .5     .42,   .49
     PG 8         .45       .101          .8     .65,   .78
     PG 9         .95       .259         1.6    1.10,  1.64

G    LOS 1        .99       .035         4.3     .76,  2.86
     LOS 2        .85       .028         4.7    1.01,  3.47
     LOS 3        .62       .096         1.2     .57,   .75
     LOS 4        .79       .138         6.3     .68,  1.57
     LOS 5        .73       .062         4.6    1.02,  4.62
     LOS 6-17     .68       .119         1.2     .89,  1.19
     LOS 18-31    .97       .059         4.0    1.11,  4.03
     PG 1-3       .99       .030         4.2     .78,  3.47
     PG 4         .41       .041          .5     .34,   .46
     PG 5         .62       .062         1.4     .90,  1.41
     PG 6         .73       .041          .8     .56,   .78
     PG 7         .69       .120         1.2     .90,  1.16
     PG 8         .67       .210         1.3     .96,  1.26
     PG 9         .71       .265         1.8     .49,  1.83

A    LOS 1        .96       .311         1.4     .94,  1.37
     LOS 2        .24       .126          .3     .25,   .33
     LOS 3        .69       .253         1.0     .86,   .95
     LOS 4        .78       .295          .5     .44,   .51
     LOS 5        .59       .109         1.2     .66,  1.18
     LOS 6-17     .42       .164          .6     .35,   .57
     LOS 18-31    .33       .094         2.7    1.22,  2.72
     PG 1-3       .93       .039          .8     .60,   .78
     PG 4         .94       .110          .8     .53,   .79
     PG 5         .85       .185         1.1     .70,  1.13
     PG 6         .59       .119          .9     .75,   .90
     PG 7         .56       .251         1.1     .62,  1.06
     PG 8         .65       .425          .8     .68,   .78
     PG 9         .55       .277          .9     .62,   .92

TABLE 2

Dep                           Std      Coef              Forecast
Var  Group        Mean        Dev      Var    R^2(.05)   1976 (k=.05)    RMSE

W    LOS 1        889.9      954.6     1.07     .291        569.8       1137.1
     LOS 2      10545.6     6410.5      .61     .847       6051.1       3543.6
     LOS 3      21957.2    11497.9      .52     .959       3131.8       3298.6
     LOS 4      47718.0    16041.8      .34     .988      21117.8       2474.9
     LOS 5       8832.9     2833.4      .32     .224       6589.1       3529.7
     LOS 6-17    9853.2     2614.7      .26     .901       3712.5       1163.4
     LOS 18-31    875.3      283.3      .32     .547        756.0        269.7
     PG 1-3     27683.2     8296.4      .30     .791      18604.2       5362.6
     PG 4       44525.9    12773.1      .29     .954      28005.3       3895.3
     PG 5       24146.5    10168.7      .42     .742      10955.0       7301.7
     PG 6        3298.4      747.3      .23     .634       2463.3        639.3
     PG 7         837.1      316.2      .38     .810        249.5        194.7
     PG 8         121.7       32.4      .27     .451         77.7         33.9
     PG 9          59.2       18.7      .32     .948         13.1          6.0

G    LOS 1       3482.3     2476.2      .75     .945       7800.8       1010.6
     LOS 2       1978.9     1185.5      .60     .408       5003.2       1490.0
     LOS 3       1401.3      508.8      .36     .614       2704.7        516.1
     LOS 4       1508.0      483.4      .32     .734       1521.9        406.9
     LOS 5       1108.8      207.9      .19     .685       1049.2        190.5
     LOS 6-17    5518.1     1242.5      .23     .666       3412.0       1173.4
     LOS 18-31    935.6      323.5      .36     .693        929.2        291.2
     PG 1-3      8531.9     4707.1      .55     .951      19424.3       1708.9
     PG 4        2675.9      650.4      .24     .791       2803.4        485.4
     PG 5        2508.0      379.4      .15     .445       2229.4        461.2
     PG 6        1233.8      302.4      .25     .669        825.1        284.1
     PG 7         802.0      321.8      .40     .801        756.7        234.5
     PG 8         117.4       38.4      .33     .844         80.0         24.8
     PG 9          64.0       21.7      .34     .623         52.1         21.8

A    LOS 1      20545.1     7329.1      .36     .957      31200.8       2138.9
     LOS 2      10131.1     3000.8      .30     .243       9611.2       3691.4
     LOS 3       5972.9     1499.3      .25     .686       6606.8       1187.5
     LOS 4       3001.3      961.3      .32     .779       1785.7        638.5
     LOS 5       1095.2      251.7      .24     .581        721.4        230.5
     LOS 6-17    5767.1     1617.6      .28     .419       7608.1       1743.5
     LOS 18-31    876.1      105.5      .12     .297        764.0        125.0
     PG 1-3     35783.0     8841.4      .25     .924      50379.2       3447.0
     PG 4        5345.8     1839.6      .34     .932       3924.1        678.0
     PG 5        3277.1     1171.3      .36     .846       2583.2        649.2
     PG 6        1851.4      664.6      .36     .587       1074.7        604.1
     PG 7         878.6      377.2      .43     .557        881.9        355.0
     PG 8         189.0       71.8      .38     .653        227.7         59.9
     PG 9          64.0       13.6      .21     .542         72.6         13.1

TABLE 3

Regression Coefficients of the Indicated Variables (k=.05)
(characters that are illegible in the source copy are reproduced as printed)

Dep
Var  Group        Const.         U             V             S             T

W    LOS 1        -a5.bj7        0.330288      0.633534     -0.353212      0.017j17
     LOS 2        -332J.404     -0.23541       0.Jj2801      0.04428       0.0u5J3j
     LOS 3        3563.uua      -2.4o084o     -0.12472       0.078764      0.315323
     LOS 4        13566.40      -0.746573      0.148314     -0.014002      0.7j3J14
     LOS 5        -163.258      -2.282954      0.467361     -0.168073      0.318064
     LOS 6-17     -4343.373     -11.lo9272     0.130415      0.021375      0.109361
     LOS 18-31    1437.679       0.007831     -0.058284     -0.101243      3.006547
     PG 1-3       11737.36      -0.193937      0.568101      0.229840     -0.004b64
     PG 4         -2303.351     -14.6353      -0.086057     -0.103196      0.529383
     PG 5         -45036.331     12.058667     0.253253      0.07591       0.530159
     PG 6         4405.6        -5.851836      0.4356       -0.157238      0.002881
     PG 7         -334.23a      -0.893586      0.371512     -0.06209       0.053288
     PG 8         3.105         -1.013038      0.411105     -0.034354      0.0164J4
     PG 9         53.102         0.842668      1.052866     -0.048284     -0.007138

Dep
Var  Group        Const.         G             A             U             S             T

G    LOS 1        -42/3.732      0.033333      0.393833     -2.730136      0.462755      b.00u3J7
     LOS 2        633.332        0.835267      0.055717      0.00732       0.08d010      0.0027)4
     LOS 3        1631.658       0.779372     -0.150336      0.074671     -0.322231     -0.u020b7
     LOS 4        1765.65G       0.322559      0.304o02     -0.016921      0.001838      0.0310c
     LOS 5        428.317        0.638435     -0.413748     -0.202134      0.011753      0.u17215
     LOS 6-17     10478.53      -0.Go7479     -0.489432     -2.190218      0.140414     -0.Cl)u3b
     LOS 18-31    -341.382      -0.055081      1.402133      0.860208     -0.03H002      0.01u77H
     PG 1-3       6063.33        0.453463      0.283034     -0.243511     -0.152454     -0.01J333
     PG 4         •3354.333     -0.‘404324    -0.285319      0.633956     -0.004342     -0.010881
     PG 5         2478.81       -0.231338     -0.156558     -0.258073      0.001318      0.010872
     PG 6         -1055.742     -0.207115     -0.228523      1.245263     -0.003333      0.046334
     PG 7         867.384       -0.203065      0.418086     -1.634225     -0.399786      0.018lb5
     PG 8         -227.022       0.167405      0.176314      0.320814     -0.049615      0.04025
     PG 9         68.438         0.G43818      0.301776      3.739978     -0.038501      0.03)221

Dep
Var  Group        Const.         A             U             X             T

A    LOS 1        32036.51       0.754403     -0.401486      0.82u655     -0.069612
     LOS 2        -1762.357      0.439853      2.202321     -0.162965      0.037994
     LOS 3        -4252.33       0.460742      3.664622     -0.013841      0.05637
     LOS 4        -5700.411      0.1135        3.283925      0.046024      0.085348
     LOS 5        2597.368      -0.246084      0.234733     -0.153388     -0.006043
     LOS 6-17     1115.538      -0.38336       0.232208     -0.054455     -0.085045
     LOS 18-31    26.042         0.245201      3.8736        0.026573     -0.001508
     PG 1-3       -2625.808      0.316001     -2.507623     -29.055833     0.104343
     PG 4         o853.783       0.238235      0.002521     -1.532162      0.022406
     PG 5         5675.728      -0.207334      0.497185     -0.51389       0.024126
     PG 6         867.841        0.300238      0.000633     -0.118133      0.027116
     PG 7         423.653        0.674542     -0.613689     -0.005627      0.007372
     PG 8         7734.112       0.151657      6.408782     -0.172355     -0.00384
     PG 9         429.847       -0.136835      0.779111     -0.021464      0.01474

APPENDIX B


ELEVEN YEAR AVERAGES


TABLE B.1

Average Separations

LOS    E1-3      E4      E5      E6      E7      E8      E9

  1    2057    1254     110      15      11       2       2
  2    0720    7888     080      41      12       2       1
  3    7183   11837    4210      43      18       3       2
  4    8535   24107   17702     148      25       4       2
  5     740    2014    2512     159      32       4       2
  6     404     017    2089     580      40       4       3
  7     111     340    1107     015      48       8       3
  8      79     208    1282    1433      80       7       4
  9      42     240    1278    1880     107       6       4
 10      40     220    1078    1712     242       6       4
 11      18      97     488    1088     278      10       4
 12      14      87     443    1138     308      25       4
 13       7      45     350    1200     875      88       6
 14       8      51     384    1434     844     103       9
 15       7      48     374    1548    1145     182      25
 16       6      40     270    1175    1003     188      32
 17       5      24     140     700     817     213      52
 18       3      15     108     521     681     188      55
 19       2      11      81     280     543     185      68
 20       2       8      38     180     428     162      88
 21       1       8      10      05     280     128      85
 22       1       4      14      84     102      94      54
 23       1       3      11      45     148      81      60
 24       1       2       0      33     128      77      57
 25       1       T       5      28     102      66      59
 26       1       T       5      15      78      54      48
 27       1       1       2       9      57      42      47
 28       1       1       2       8      38      30      29
 29       1       1       1       3      17       9      11
 30       1       1       1       3      10       7       9
 31       1       1       2       2      11       7       6



TABLE B.2

Average Eligibles

LOS    E1-3      E4      E5      E6      E7      E8      E9

  1    1307    1334     116      14       9       1       1
  2    6104    7573     960      40      10       1       0
  3    503R   11699    4167      41      16       1       1
  4    5666   33770   17690     144      34       3       1
  5     579    1963    3466     157      31       3       1
  6     351     696    3050     556      36       3       3
  7      91     339    1169     911      47       5       2
  8      63     366    1371    1436      79       6       3
  9      33     340    1366    1673     196       5       3
 10      39     331    1065    1706     341       5       3
 11      14      94     463    1061     376       9       3
 12      11      64     440    1133     397      24       3
 13       5      43     355    1396     673      65       5
 14       5      49     360    1439     64?     102       6
 15       6      45     370    1544    1143     161      23
 16       5      36     375    1173    1001     167      31
 17       4      33     136     707     616     212      51
 18       7      14     107     519     679     167      54
 19       1      10      59     376     541     164      67
 20       1       7      35     167     423     161      67
 21       0       5      16      93     276     137      64
 22       0       3      13      61     169      93      53
 23       0       0      10      43     144      79      59
 24       0       1       6      30     124      76      55
 25       0       1       4      34      97      65      56
 26       0       1       3      12      71      52      47
 27       0       0       1       6      53      40      46
 28       0       0       1       4      33      26      36
 29       0       0       0       3      14       6      10
 30       0       0       0       3       6       6       6
 31       0       0       1       1       9       6       5

TABLE B.3

Average Non-Reenlistments

LOS    E1-3      E4      E5      E6      E7      E8      E9

  1   11<30    1134     102      11       9       2       1
  2    5841    6343     647      30      10       2       1
  3    4639    9600    2594      30      16       2       2
  4    5260   21297   13434      52      22       4       2
  5     432    1536    1711      79      28       4       2
  6     127     525    1361     354      34       3       3
  7      56     190     566     404      37       6       3
  8      43     154     511     460      37       6       4
  9      19      91     396     428      38       5       4
 10      24      83     347     397      50       5       3
 11       6      27     107     182      55       5       3
 12       6      21      82     142      52       7       4
 13       3      11      55     104      47       5       3
 14       4      13      45      99      46       7       3
 15       3       9      35      78      40       6       3
 16       2       6      21      53      35       7       2
 17       2       3      11      24      23       5       2
 18       1       2       7      19      21       6       3
 19       1       2       4      10      19       4       3
 20       1       2       5       9      19       5       3
 21       1       2       3       6      13       3       2
 22       1       1       2       5      11       3       2
 23       1       1       2       4      14       3       1
 24       1       1       2       5      16       3       3
 25       1       1       1       4      13       3       3
 26       1       1       1       3      11       3       2
 27       1       1       1       2       9       3       2
 28       1       1       1       1       6       2       2
 29       1       1       1       1       4       2       1
 30       1       1       1       1       3       1       1
 31       1       1       1       1       7       3       1

APPENDIX C


APL  programs 

The main programs are REGR and RESID, which perform the
ridge regression computations and develop the residuals and
their properties (resp.). These are used in FCAST, which computes
the regression forecast for the entire 31 by 7 set. Individual
cell forecasts are computed by PRED. The other programs prepare
the data.

More explicitly, the raw data consist of five 11 by 31
by 9 arrays D1, D2, ..., D5 which carry the eleven year
values of non-reenlistments, separations, eligibles, retentions,
and inventory (resp.). The program COMPRESS merely telescopes
pay grades E1, E2, E3 together, producing 11 x 31 x 7 arrays.
This must be done separately.
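
The effect of COMPRESS can be sketched in Python as follows. This is
an illustration of the described operation, not a transcription of
the APL listing; the nested-list layout `data[year][los][pay_grade]`
and the function name are assumptions.

```python
def compress_pay_grades(data):
    """Sum pay grades E1-E3 into one cell, leaving E4-E9 unchanged.

    data[year][los] is a sequence of 9 pay-grade counts (E1..E9);
    the result has 7 counts per cell (E1-3, E4, ..., E9).
    """
    return [[[sum(cell[:3])] + list(cell[3:]) for cell in year]
            for year in data]
```

Applied to an 11 x 31 x 9 array this yields an 11 x 31 x 7 array, and
it would be called once for each of the five data arrays.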

The  function  FCAST  requires  an  explicit  input  vector 
CR  which  is  the  set  of  partition  boundaries  for  grouping  the 
range  space  of  the  217  eleven  year  means  of  the  object  variable. 
It  also  uses  the  data  array  DD  implicitly. 

VCnMPRESSLUlV 

V 7-<-rOMPRFRr  D\Vi7\ 

ri]  f<h(pp)po 
r2]  FC  ; ; 1 2 

[3]  Zl.4.+  /yx/) 

[4]  0 0 3 

[SI  7.^7.^  ,7. 

V 


V PP^FOAST  FRxJ\P 

[11  l'-^-((prP).3l  ,7)p0 

[2]  PP->-  21  7 pO 

[3]  J-hO 

[4]  /,l:,r-H,r+i 

[5]  VLJ  ; ; X/^P^rpr^+l  1 

[6]  POPPED  VLJ-,  ;] 

[7]  PP^PP+P 

[R]  -*r,1  X iJ<~i+pCP 

V 




The function FCAST creates a 31 by 7 screening matrix
of zeros and ones which is used by the data shaping functions
PREP, BUILD, and SHAPE. The last of these, SHAPE, merely shaves
off the first or last face (according to whether the variable Q
is minus one or plus one) of the individual data arrays. The
function BUILD assembles the object variable in its first face
and the p regression variables in the remaining faces. This
is the only function that needs to be changed with each
application. The version shown is for regressing non-reenlistments
on inventory, previous non-reenlistments, and previous
retentions. The function PREP prepares the assembled data for
REGR.
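The shaving and assembling steps can be illustrated for a single LOS and pay grade cell. The Python sketch below is not the original SHAPE or BUILD, which operate on whole arrays. It assumes that Q = -1 drops the first year and Q = +1 drops the last (following the order of the sentence above), and that inventory is taken in the same year as the object variable; the sample series are hypothetical.

```python
def shape(series, q):
    """Drop the first year when q == -1, the last when q == +1 (assumed mapping)."""
    if q == -1:
        return series[1:]
    if q == 1:
        return series[:-1]
    raise ValueError("q must be -1 or +1")


def build(v, inv, ret):
    """Assemble rows (V_t, I_t, V_{t-1}, R_{t-1}) for one cell.

    v: non-reenlistments, inv: inventory, ret: retentions, one value per year.
    """
    y = shape(v, -1)     # object variable: current non-reenlistments
    x1 = shape(inv, -1)  # inventory, assumed to be same-year
    x2 = shape(v, 1)     # previous non-reenlistments
    x3 = shape(ret, 1)   # previous retentions
    return [list(row) for row in zip(y, x1, x2, x3)]


# Hypothetical eleven-year series for one cell.
v = list(range(1, 12))
inv = [100 + t for t in range(11)]
ret = [50 + t for t in range(11)]
rows = build(v, inv, ret)
print(len(rows))  # 10
print(rows[0])    # [2, 101, 1, 50]
```

Eleven years of data yield ten observations, because the lagged variables are not available for the first year.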
[APL listing of the function PRED (four numbered lines), which
calls PREP, REGR, and RESID in turn; not legible in the
available copy.]




[APL listing of the function PREP (three numbered lines survive);
not legible in the available copy.]
The inputs to REGR are the data matrix D and the
vector of ridge constants RC. The rank of D is three:
the first dimension (faces) indexes the problems (separate
predictions), the second (rows) the observations (years),
and the third (columns) the variables, with the object variable
coming first, followed by the regression variables. The output R
also has three dimensions. Again the faces are the prediction
problems. For any face, the first ρRC elements of column one
(ρRC being the number of ridge constants)
are the coefficients of determination (i.e. the square of the
multiple correlation) and the first ρRC elements of the last
column are the values of the ridge constant. All intermediate
columns contain the standardized regression coefficients in the
same order as they appear in D. The last two rows contain the
mean and standard deviation (resp.) of the object and
regression variables. Zeros are used to square up the array.
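The layout of one face of R can be made concrete with a short Python sketch. It is not the original REGR. It assumes the usual standardized form of ridge regression, b(k) = (Rxx + kI)^(-1) rxy, a coefficient of determination computed as 1 - SSE/SST, and standard deviations with divisor n - 1; none of these choices is confirmed by the listing. The names `solve` and `regr_face` and the sample data are hypothetical.

```python
from math import sqrt


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def regr_face(rows, rc):
    """Build one face of R from rows (y, x1, ..., xp) and ridge constants rc."""
    n, cols = len(rows), list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [sqrt(sum((v - mu) ** 2 for v in c) / (n - 1))
           for c, mu in zip(cols, means)]
    z = [[(v - mu) / sd for v in c] for c, mu, sd in zip(cols, means, sds)]
    corr = [[sum(s * t for s, t in zip(zi, zj)) / (n - 1) for zj in z] for zi in z]
    p = len(cols) - 1
    r_xy = [corr[i + 1][0] for i in range(p)]
    face = []
    for k in rc:
        a = [[corr[i + 1][j + 1] + (k if i == j else 0.0) for j in range(p)]
             for i in range(p)]
        b = solve(a, r_xy)
        # 1 - SSE/SST in standardized units equals 2 b'r - b'Rb.
        quad = sum(b[i] * corr[i + 1][j + 1] * b[j]
                   for i in range(p) for j in range(p))
        r2 = 2 * sum(bi * ri for bi, ri in zip(b, r_xy)) - quad
        face.append([r2] + b + [k])      # one row per ridge constant
    face.append(means + [0.0])           # mean row, padded with a zero
    face.append(sds + [0.0])             # standard deviation row, padded
    return face


# Hypothetical data with y = 1 + 2*x1 + x2 exactly.
rows = [(5, 1, 2), (6, 2, 1), (11, 3, 4), (12, 4, 3), (17, 5, 6), (18, 6, 5)]
face = regr_face(rows, [0.0, 0.1])
for line in face:
    print([round(v, 4) for v in line])
```

With two ridge constants and two regression variables the face has four rows and four columns; the zero at the end of the last two rows is the padding that squares up the array.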
The function RESID takes as input the data matrix D
and the output R of REGR. Its explicit output E is the
set of residuals. Implicit output includes EB, the mean
residual; SEE, the root mean square error of forecast; and
BB, the regression coefficients converted back to dimensional
form. The array C holds the constant terms of the regression
equations. Line 15 shows that only the positive parts of the
predictions are used.
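The conversion to dimensional form and the residual summaries can be sketched in Python as follows. This is not the original RESID. It assumes the standard back-transformation BB_j = b_j s_y / s_j with constant term C = mean(y) - sum of BB_j mean(x_j), residuals taken as observed minus predicted, and a divisor of n in SEE; the function name `resid` and the numbers are hypothetical.

```python
from math import sqrt


def resid(rows, b_std, means, sds):
    """Residual summary for one problem and one ridge constant.

    rows: observations (y, x1, ..., xp); b_std: standardized coefficients;
    means, sds: per-variable mean and standard deviation, object variable first.
    """
    # Convert standardized coefficients back to dimensional form.
    bb = [b * sds[0] / s for b, s in zip(b_std, sds[1:])]
    c = means[0] - sum(bj * m for bj, m in zip(bb, means[1:]))
    # Only the positive part of each prediction is used.
    pred = [max(0.0, c + sum(bj * x for bj, x in zip(bb, row[1:])))
            for row in rows]
    e = [row[0] - p for row, p in zip(rows, pred)]  # assumed sign convention
    eb = sum(e) / len(e)                            # mean residual
    see = sqrt(sum(r * r for r in e) / len(e))      # root mean square error
    return e, eb, see, bb, c


# Hypothetical inputs: one regression variable, three observations.
rows = [(0, 1), (2, 2), (3, 3)]
e, eb, see, bb, c = resid(rows, b_std=[1.0], means=[1.0, 2.0], sds=[2.0, 1.0])
print(bb, c)  # [2.0] -3.0
print(e)      # [0.0, 1.0, 0.0]
```

In the first observation the raw prediction is -1, which is replaced by zero before the residual is formed.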